By @carnby.
This notebook showcases the basic matta visualizations and how to use them.
Note that the init_javascript call is not needed when running on a local server, provided you have added the JavaScript code to your IPython profile.
In [1]:
import matta
matta.init_javascript(path='https://rawgit.com/carnby/matta/master/matta/libs/')
Out[1]:
Wordclouds are implemented using the d3.layout.cloud layout by Jason Davies. They work with bags of words, for which Python's collections.Counter class is a good fit.
In [2]:
import requests
hamlet = requests.get('http://www.gutenberg.org/cache/epub/2265/pg2265.txt').text
hamlet[0:100]
Out[2]:
In [3]:
import re
from collections import Counter
# findall avoids the empty strings that re.split yields at the text boundaries
words = re.findall(r'\w+', hamlet.lower())
counts = Counter(words)
In [4]:
from matta import wordcloud
wordcloud(items=counts.most_common(n=1000), typeface='Helvetica', font_scale=0.33, rotation=-7)
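To see what a bag of words looks like before it reaches wordcloud, here is a minimal sketch on a short string. The helper name bag_of_words and the sample sentence are illustrative assumptions, not part of matta. It tokenizes with re.findall, so no empty string ends up in the counts.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Return a Counter of lowercase word tokens found in `text`."""
    return Counter(re.findall(r'\w+', text.lower()))

# Illustrative sample text (an assumption, not the notebook's data).
sample = "To be, or not to be: that is the question."
sample_counts = bag_of_words(sample)

# most_common returns (word, count) pairs, the shape passed as `items` above.
top_items = sample_counts.most_common(3)
print(top_items)
```

Each element of top_items is a (word, count) tuple, which is the same structure that counts.most_common(n=1000) produces for the full text.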
Treemaps use the Treemap Layout from d3.js. They work with trees, which we construct through networkx.DiGraph.
In [5]:
import requests
flare_data = requests.get('https://gist.githubusercontent.com/mbostock/4063582/raw/a05a94858375bd0ae023f6950a2b13fac5127637/flare.json').json()
In [6]:
flare_data['name']
Out[6]:
In [7]:
import networkx as nx
tree = nx.DiGraph()
def add_node(node):
    # Ids are assigned sequentially, in the order nodes are visited.
    node_id = tree.number_of_nodes() + 1
    attrs = {'name': node['name']}
    if 'size' in node:
        attrs['size'] = node['size']
    tree.add_node(node_id, **attrs)
    # Recurse into the children and link each one to its parent.
    for child in node.get('children', []):
        child_id = add_node(child)
        tree.add_edge(node_id, child_id)
    return node_id
add_node(flare_data)
Out[7]:
In [8]:
nx.is_arborescence(tree)
Out[8]:
In [9]:
from matta import treemap
treemap(tree=tree, node_value='size', node_label='name', font_size=9, node_border=1, node_padding=0)
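The nx.is_arborescence check used above can be tried on a small tree first. The sketch below is only an illustration: the nested dictionary toy and the helper build_tree are hypothetical, chosen to mirror the name/size/children shape of flare.json. It shows the check passing for a proper tree and failing once a node gains a second parent.

```python
import networkx as nx

# Hypothetical nested data in the same shape as flare.json.
toy = {'name': 'root', 'children': [
    {'name': 'a', 'size': 3},
    {'name': 'b', 'children': [{'name': 'c', 'size': 5}]},
]}

def build_tree(data):
    """Convert nested dicts into a DiGraph with edges from parent to child."""
    tree = nx.DiGraph()

    def visit(node):
        node_id = tree.number_of_nodes() + 1
        attrs = {key: node[key] for key in ('name', 'size') if key in node}
        tree.add_node(node_id, **attrs)
        for child in node.get('children', []):
            tree.add_edge(node_id, visit(child))
        return node_id

    visit(data)
    return tree

toy_tree = build_tree(toy)
is_tree = nx.is_arborescence(toy_tree)

# Giving node 2 ('a') a second parent breaks the tree property.
toy_tree.add_edge(4, 2)
is_tree_after = nx.is_arborescence(toy_tree)
print(is_tree, is_tree_after)
```

Running the check before plotting is a cheap way to catch malformed input, since a treemap needs every node except the root to have exactly one parent.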
Sankey (or flow) diagrams use the Sankey plugin by Mike Bostock. They work with digraphs, just like treemaps. Note that graphs with loops (cycles) are not supported.
In [10]:
sankey_data = requests.get('http://bost.ocks.org/mike/sankey/energy.json')
In [11]:
import json
from networkx.readwrite import json_graph
sankey_graph = json_graph.node_link_graph(json.loads(sankey_data.text))
In [12]:
next(iter(sankey_graph.nodes(data=True))), next(iter(sankey_graph.edges(data=True)))
Out[12]:
In [13]:
from matta import sankey
sankey(graph=sankey_graph, background_color='#efefef', node_label='name', link_weight='value',
       node_color='indigo', node_width=8, node_padding=13,
       link_color='#aaa', link_opacity=0.75)
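Because graphs with loops are not supported, it can help to test a graph before plotting it. The following is a minimal sketch using networkx only. The function check_sankey_input and the small flows graph with its values are hypothetical, and nothing here is part of matta itself.

```python
import networkx as nx

def check_sankey_input(graph):
    """Return (ok, cycle): ok is True for a directed acyclic graph;
    otherwise cycle lists the edges of one offending cycle."""
    if not graph.is_directed():
        raise ValueError('expected a directed graph')
    if nx.is_directed_acyclic_graph(graph):
        return True, None
    return False, list(nx.find_cycle(graph))

# Hypothetical flow data, for illustration only.
flows = nx.DiGraph()
flows.add_edge('coal', 'electricity', value=10)
flows.add_edge('electricity', 'homes', value=7)
ok_before, _ = check_sankey_input(flows)

# Adding a back edge creates a cycle, which the check reports.
flows.add_edge('homes', 'coal', value=1)
ok_after, cycle = check_sankey_input(flows)
print(ok_before, ok_after, cycle)
```

The same function could be called on sankey_graph before passing it to sankey, so that a cycle is reported by name instead of surfacing as a layout problem.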
Parallel Coordinates are based on the code by Jason Davies. They work with pandas.DataFrame.
In [14]:
import pandas as pd
df = pd.read_csv('http://bl.ocks.org/jasondavies/raw/1341281/cars.csv', index_col='name')
df.head()
Out[14]:
In [15]:
from matta import parallel_coordinates
parallel_coordinates(dataframe=df)
Graphs built with networkx are visualized using the Force Layout in d3.js. The example below uses an undirected networkx.Graph.
In [16]:
graph = nx.davis_southern_women_graph()
In [17]:
for node_id, attrs in graph.nodes(data=True):
    # Color nodes by bipartite set and size them by degree.
    attrs['color'] = 'purple' if attrs['bipartite'] else 'green'
    attrs['size'] = graph.degree(node_id)
In [18]:
from matta import force_directed
force_directed(graph=graph, link_distance=200, avoid_collisions=True, clamp_to_viewport=True,
               background_color='#efefef', node_value='size', node_min_ratio=8, node_max_ratio=36)